ABSTRACT

A statistical test for cheating is developed. The 
case of a single examinee who has taken parallel forms of the same 
selection test on three occasions, obtaining scores x, y, z, is used 
to illustrate the development. It is assumed that each score is
normally distributed with the same known variance, that is, the 
variance of the errors of measurement. These scores are further 
assumed to be distributed independently, since each score differs 
from its mean (true) value only because of errors of measurement. 
Based on these assumptions, a significance test is presented to 
indicate evidence of cheating. Mathematical derivations for the test 
of significance are presented as well as a numerical example. 
(Author/RC) 
A single examinee has taken parallel forms of the same selection
test on three occasions, obtaining scores $x$, $y$, $z$. We assume that
each score is normally distributed with the same known variance $\sigma^2$,
the variance of the errors of measurement. The scores are assumed to be
distributed independently, since each score differs from its mean ('true')
value only because of an error of measurement.

Either

$H_0$: all three scores have the same mean $\mu$

or

$H_1$: scores $x$ and $z$ have mean $\mu$ and score $y$ has mean $\nu$ ($\nu > \mu$).

We wish a significance test for the null hypothesis $H_0$. In practice,
rejection of $H_0$ is considered evidence of cheating on test $y$.

2. Reparameterization

Define $\theta \equiv \nu - \mu$ and consider $\theta$ and $\mu$ as the (unknown)
parameters. Now $H_0$ is $\theta = 0$. This is in standard form for a composite
hypothesis about $\theta$. The unspecified ('nuisance') parameter is $\mu$.



3. Transformation of Sample Space

Let us replace the sample observations $x$, $y$, $z$ by the transformed
observations

(1)  $m = (x + y + z)/3$,

(2)  $t = \frac{y - m}{\sigma\sqrt{2/3}} = \frac{-x + 2y - z}{\sigma\sqrt{6}}$,

(3)  $d = \frac{y - x}{\sigma\sqrt{2}}$.

This transformation will turn out to be useful in setting up the desired
significance test. (The reason $x$ and $z$ are not treated symmetrically
in (3) is discussed in Section 10.)
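The transformation can be sketched in a few lines of Python. This is only an illustration: the function name `transform_scores` and the scores used below are hypothetical, and the definitions are taken to be $m = (x + y + z)/3$, $t = (y - m)/\sigma\sqrt{2/3}$, and $d = (y - x)/\sigma\sqrt{2}$.

```python
import math

def transform_scores(x, y, z, sigma):
    """Return (m, t, d) for three scores with known error s.d. sigma."""
    m = (x + y + z) / 3.0
    t = (y - m) / (sigma * math.sqrt(2.0 / 3.0))
    d = (y - x) / (sigma * math.sqrt(2.0))
    return m, t, d

# Hypothetical scores and error s.d., chosen only for illustration.
m, t, d = transform_scores(500.0, 560.0, 510.0, sigma=30.0)

# The second form of t, (-x + 2y - z) / (sigma * sqrt(6)), should agree.
t_alt = (-500.0 + 2.0 * 560.0 - 510.0) / (30.0 * math.sqrt(6.0))
```

Computing `t` both ways is a quick check that the two expressions for $t$ are the same quantity.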

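4. Distribution of Sample under $H_0$

From (1), (2), and (3) we find the means, variances, and covariances
of $m$, $t$, and $d$ under $H_0$ to be

(4)  $\mu_{0m} = \mu$, $\mu_{0t} = 0$, $\mu_{0d} = 0$;
     $\sigma_m^2 = \sigma^2/3$, $\sigma_t^2 = 1$, $\sigma_d^2 = 1$;
     $\sigma_{mt} = 0$, $\sigma_{md} = 0$, $\sigma_{dt} = \tfrac{1}{2}\sqrt{3}$.

The joint distribution of $m$, $t$, and $d$ under $H_0$ is normal
trivariate with the parameters given by (4).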
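The second moments under $H_0$ can be checked numerically by treating $m$, $t$, and $d$ as linear combinations of three independent scores, each with variance $\sigma^2$. The sketch below assumes the definitions $m = (x + y + z)/3$, $t = (-x + 2y - z)/\sigma\sqrt{6}$, and $d = (y - x)/\sigma\sqrt{2}$; the function name `moments_under_h0` is hypothetical.

```python
import math

def moments_under_h0(sigma):
    """Variances and covariances of m, t, d for independent x, y, z."""
    s6 = sigma * math.sqrt(6.0)
    s2 = sigma * math.sqrt(2.0)
    # Coefficients of (x, y, z) in each transformed observation.
    rows = {
        "m": (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0),
        "t": (-1.0 / s6, 2.0 / s6, -1.0 / s6),
        "d": (-1.0 / s2, 1.0 / s2, 0.0),
    }
    cov = {}
    for a, ra in rows.items():
        for b, rb in rows.items():
            # Independence: the covariance is sigma^2 times the dot product.
            cov[(a, b)] = sigma ** 2 * sum(p * q for p, q in zip(ra, rb))
    return cov

cov = moments_under_h0(32.0)  # 32 is an illustrative value of sigma
```

Under these assumptions `cov[("t", "d")]` comes out as $\tfrac{1}{2}\sqrt{3}$ and both covariances with $m$ vanish, whatever value of `sigma` is supplied.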

5. A Sufficient Statistic under $H_0$

Since $t$ and $d$ have zero covariance with $m$ under $H_0$, they are
distributed independently of $m$. It appears from (4) that under $H_0$
the joint distribution of $t$ and $d$ does not depend on $\mu$. Thus $m$
is a sufficient statistic for $\mu$ under $H_0$.

6. Uniformly Most Powerful Significance Test

Because of the sufficiency of $m$ for the nuisance parameter $\mu$,
the problem of finding a uniformly most powerful significance test for the
composite hypothesis $H_0$: $\theta = 0$ is reduced to the problem of finding
a best critical region for testing the simple hypothesis that $\theta = 0$,
$m$ being held constant (Kendall and Stuart, 1973, section 17.20;
Lehmann, 1959, section 4.[?]). The best critical region for the simple
hypothesis will depend on the conditional distributions of the
transformed observations for given $m$.

7. Conditional Distributions of Observations

Both $t$ and $d$ have zero covariance with $m$ under either $H_0$ or
$H_1$. Thus the distribution of $t$ and $d$ is not affected by holding $m$
constant. Under $H_1$, the means of $t$ and $d$ are seen to be

(5)  $\mu_{1t} = \frac{\theta}{\sigma\sqrt{3/2}}$,  $\mu_{1d} = \frac{\theta}{\sigma\sqrt{2}}$.

The variances and covariances under $H_1$ are the same as those under
$H_0$, listed in (4).
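The means under $H_1$ follow in one step from the definitions of $t$ and $d$. A worked version, assuming $t = (-x + 2y - z)/\sigma\sqrt{6}$, $d = (y - x)/\sigma\sqrt{2}$, and $\theta = \nu - \mu$:

```latex
\begin{align*}
E(t \mid H_1) &= \frac{-\mu + 2\nu - \mu}{\sigma\sqrt{6}}
               = \frac{2\theta}{\sigma\sqrt{6}}
               = \frac{\theta}{\sigma\sqrt{3/2}}, \\
E(d \mid H_1) &= \frac{\nu - \mu}{\sigma\sqrt{2}}
               = \frac{\theta}{\sigma\sqrt{2}}.
\end{align*}
```

The second moments are unchanged because shifting the mean of $y$ by $\theta$ does not alter any variance or covariance.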

Under $H_1$, the joint distribution $f_1(t,d)$ of $t$ and $d$
(whether conditional on $m$ or unconditional) is normal bivariate so that

(6)  $\log f_1(t,d) = K - \frac{1}{2(1 - \rho^2)} \left[ (t - \mu_{1t})^2 + (d - \mu_{1d})^2 - 2\rho (t - \mu_{1t})(d - \mu_{1d}) \right]$

where $K$ is a constant and

$\rho = \sigma_{dt}/\sigma_d \sigma_t = \tfrac{1}{2}\sqrt{3}$.

Under $H_0$, $\mu_{1t}$ and $\mu_{1d}$ are replaced by $\mu_{0t} = \mu_{0d} = 0$, so the
corresponding equation under $H_0$ is

(7)  $\log f_0(t,d) = K - \frac{1}{2(1 - \rho^2)} (t^2 + d^2 - 2\rho t d)$.

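8. Likelihood Ratio

From (6) and (7), the logarithm of the likelihood ratio is

$\ell \equiv \log f_0(t,d) - \log f_1(t,d)$.

Substituting from (4) and (5),

$\ell \propto -\frac{2t\theta}{\sigma\sqrt{3/2}} + \frac{2\theta^2}{3\sigma^2} - \frac{2d\theta}{\sigma\sqrt{2}} + \frac{\theta^2}{2\sigma^2} + \frac{\sqrt{3}\,t\theta}{\sigma\sqrt{2}} + \frac{\sqrt{3}\,d\theta}{\sigma\sqrt{3/2}} - \frac{\sqrt{3}\,\theta^2}{\sigma^2\sqrt{3}}$.

The terms involving $d$ drop out (this was the reason for choosing
definition (2) for $t$) and after simplification

(8)  $\ell \propto \frac{\theta}{\sigma\sqrt{6}} - t$.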
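A numerical check that $d$ drops out of the log likelihood ratio is sketched below. It assumes $\ell = \log f_0 - \log f_1$ with $\rho = \tfrac{1}{2}\sqrt{3}$, $\mu_{1t} = \theta/\sigma\sqrt{3/2}$, and $\mu_{1d} = \theta/\sigma\sqrt{2}$; the function names and the values of $\theta$ and $\sigma$ are hypothetical. With $a = \theta/\sigma\sqrt{6}$, direct algebra under these assumptions gives $\ell = 2a(a - t)$, which is proportional to $a - t$ for $\theta > 0$.

```python
import math

RHO = math.sqrt(3.0) / 2.0

def log_ratio(t, d, theta, sigma):
    """log f0(t, d) - log f1(t, d) from the bivariate normal exponents."""
    mu_t = theta / (sigma * math.sqrt(1.5))
    mu_d = theta / (sigma * math.sqrt(2.0))
    c = 1.0 / (2.0 * (1.0 - RHO ** 2))
    q0 = t * t + d * d - 2.0 * RHO * t * d
    q1 = ((t - mu_t) ** 2 + (d - mu_d) ** 2
          - 2.0 * RHO * (t - mu_t) * (d - mu_d))
    # The constant K cancels in the difference.
    return c * (q1 - q0)

def log_ratio_closed_form(t, theta, sigma):
    """Closed form 2a(a - t) with a = theta / (sigma * sqrt(6)); no d."""
    a = theta / (sigma * math.sqrt(6.0))
    return 2.0 * a * (a - t)

# Illustrative values: the result should not change as d varies.
values = [log_ratio(1.2, d, 50.0, 32.0) for d in (-2.0, 0.0, 3.5)]
```

All entries of `values` agree with `log_ratio_closed_form(1.2, 50.0, 32.0)` up to rounding, so the ratio depends on the data only through $t$.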

9. Best Critical Region

The best critical region for the simple hypothesis, and thus the
uniformly most powerful significance test for the composite hypothesis,
is defined by $\ell \le$ constant when $H_0$ holds (Kendall and Stuart, 1973,
section 22.10; Lehmann, 1959, section 3.5). Since $\theta = 0$ under $H_0$, the
best critical region can be defined by $t \ge k_0$, the constant $k_0$ being
chosen so that $\mathrm{Prob}(t \ge k_0 \mid H_0) = \alpha$, the chosen significance level.
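Because $t$ is standard normal under $H_0$, the critical value is an upper-tail normal quantile. A minimal sketch using Python's standard library follows; the function names `critical_value` and `reject_h0` are hypothetical.

```python
from statistics import NormalDist

def critical_value(alpha):
    """Return k0 with Prob(t >= k0 | H0) = alpha for standard normal t."""
    return NormalDist().inv_cdf(1.0 - alpha)

def reject_h0(t, alpha):
    """True when the observed t falls in the critical region t >= k0."""
    return t >= critical_value(alpha)
```

For example, `critical_value(0.05)` is about 1.645, so an observed $t$ of 1.0 would not lead to rejection at the 5 percent level.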


10. Truncation on d

In practical work, it is sometimes a practice to make no investigation
of an examinee unless $d \ge d_0$, where $d_0$ is some predetermined value
(this is the reason for dealing with the rather awkward variable $d$
throughout this report). This avoids searching out $z$ for large numbers
of individuals for whom it is highly unlikely that $t \ge k_0$.

Under this truncation $d \ge d_0$, the situation in section 4 is the
same as before except that now $d$ and $t$ have a singly truncated normal
bivariate distribution, independent of $m$, the truncation being on $d$.
Since $d$ has zero covariance with $m$, restriction on $d$ does not prevent
$m$ from being a sufficient statistic for $\mu$ under $H_0$. Thus, the problem
is still reduced to the problem of finding the best critical region for
the same simple hypothesis.

The bivariate distribution of $t$ and $d$ is now proportional to
$f_1(t,d)$ of (6) except that $f_1$ is zero when $d < d_0$. A similar change
occurs for $f_0(t,d)$, so that the likelihood ratio $\ell$ remains unaltered.
The critical region is still defined by $t \ge k_0$, where now

$\mathrm{Prob}(t \ge k_0 \mid H_0, d \ge d_0) = \alpha$

implies either a different $k_0$ for given $\alpha$ or a different $\alpha$ for given
$k_0$ than was used in the previous section.
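One way to obtain the critical value under truncation is numerical. This is a sketch under stated assumptions, not a method given in the report: $t$ and $d$ are standard normal under $H_0$ with correlation $\rho = \tfrac{1}{2}\sqrt{3}$, so $t$ given $d = u$ is normal with mean $\rho u$ and variance $1 - \rho^2$. The function names, the integration limit, and the step count are all hypothetical choices.

```python
import math
from statistics import NormalDist

STD = NormalDist()
RHO = math.sqrt(3.0) / 2.0

def joint_tail(k, d0, upper=10.0, steps=2000):
    """Approximate Prob(t >= k, d >= d0 | H0) by the trapezoidal rule."""
    s = math.sqrt(1.0 - RHO ** 2)
    h = (upper - d0) / steps
    total = 0.0
    for i in range(steps + 1):
        u = d0 + i * h
        w = 0.5 if i in (0, steps) else 1.0
        # Density of d at u times Prob(t >= k | d = u).
        total += w * STD.pdf(u) * (1.0 - STD.cdf((k - RHO * u) / s))
    return total * h

def conditional_level(k, d0):
    """Prob(t >= k | H0, d >= d0)."""
    return joint_tail(k, d0) / (1.0 - STD.cdf(d0))

def truncated_critical_value(alpha, d0, lo=-10.0, hi=10.0):
    """Bisect for k with conditional_level(k, d0) = alpha."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if conditional_level(mid, d0) > alpha:
            lo = mid  # level too large: move the cutoff up
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because $t$ and $d$ are positively correlated, restricting attention to cases with large $d$ pushes the critical value above the untruncated one for the same $\alpha$.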



11. Small-Sample Properties

It should be noted that the best critical region and the uniformly
most powerful significance test are chosen for their optimum properties,
which do not require large sample size. It can be shown that $t$ is a
sufficient statistic for $\theta$ when $m$ is fixed.
12. Numerical Example

It is known from reliability studies that the errors of measurement
of Scholastic Aptitude Test scores have a standard deviation of roughly
$\sigma = 13\sqrt{6} \approx 32$ on the CEEB score scale. Suppose $x = 400$, $y = 610$,
$z = 430$. Now $m = 480$, $t = 5$. Under $H_0$, $t$ has a mean of 0 and
a standard deviation of 1. Thus $\mathrm{Prob}(t \ge 5 \mid H_0)$ is .0000003.
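The arithmetic of the example can be reproduced as follows. The sketch reads the scanned values as $\sigma = 13\sqrt{6}$, $x = 400$, $y = 610$, $z = 430$ (an assumption, chosen because these values give $m = 480$ and $t = 5$ as stated) and uses `math.erfc` for the upper normal tail, which keeps precision for very small probabilities.

```python
import math

sigma = 13.0 * math.sqrt(6.0)   # roughly 32 on the CEEB scale (assumed reading)
x, y, z = 400.0, 610.0, 430.0   # assumed reading of the scanned scores

m = (x + y + z) / 3.0
t = (y - m) / (sigma * math.sqrt(2.0 / 3.0))

# Upper-tail probability of a standard normal variate at t.
p_value = 0.5 * math.erfc(t / math.sqrt(2.0))
```

The resulting tail probability is a little under three in ten million, which rounds to .0000003.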


